Prompt-based Consistent Video Colorization

Dani, Silvia, Uricchio, Tiberio, Seidenari, Lorenzo

arXiv.org Artificial Intelligence

Existing video colorization methods struggle with temporal flickering or demand extensive manual input. We propose a novel approach that automates high-fidelity video colorization using rich semantic guidance derived from language and segmentation. We employ a language-conditioned diffusion model to colorize grayscale frames. Guidance is provided via automatically generated object masks and textual prompts; our primary automatic method uses a generic prompt, achieving state-of-the-art results without specific color input. Temporal stability is achieved by warping color information from previous frames using optical flow (RAFT); a correction step detects and fixes inconsistencies introduced by warping. Evaluations on standard benchmarks (DAVIS30, VIDEVO20) show our method achieves state-of-the-art performance in colorization accuracy (PSNR) and visual realism (Colorfulness, CDC), demonstrating the efficacy of automated prompt-based guidance for consistent video colorization.
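
As a concrete illustration of the flow-warping step described in this abstract, the sketch below estimates current-to-previous optical flow with torchvision's RAFT model and backward-warps the previous colorized frame onto the current one. This is a minimal sketch under our own assumptions: the function names (estimate_flow, warp_colors) are illustrative, not the authors' code, and the paper's correction criterion for fixing warping artifacts is not reproduced.

```python
# Illustrative sketch of temporal color propagation via optical-flow warping,
# in the spirit of the RAFT-based consistency step described in the abstract.
# All names here are ours, not the authors' code.
import torch
import torch.nn.functional as F
from torchvision.models.optical_flow import raft_large, Raft_Large_Weights

raft = raft_large(weights=Raft_Large_Weights.DEFAULT).eval()

@torch.no_grad()
def estimate_flow(curr_gray3, prev_gray3):
    """Flow from the current frame to the previous frame, shape (N, 2, H, W).

    RAFT expects 3-channel inputs in [-1, 1] with H and W divisible by 8;
    grayscale frames can simply be replicated across channels. The model
    returns one flow per refinement iteration; we keep the last (best) one.
    """
    return raft(curr_gray3, prev_gray3)[-1]

def warp_colors(prev_color, flow):
    """Backward-warp the previous frame's colors into the current frame.

    prev_color: (N, C, H, W) colorized previous frame.
    flow:       (N, 2, H, W) current-to-previous flow in pixels.
    """
    n, _, h, w = prev_color.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=flow.dtype, device=flow.device),
        torch.arange(w, dtype=flow.dtype, device=flow.device),
        indexing="ij",
    )
    grid_x = xs.unsqueeze(0) + flow[:, 0]  # where each pixel came from (x)
    grid_y = ys.unsqueeze(0) + flow[:, 1]  # where each pixel came from (y)
    grid = torch.stack(                    # normalize to [-1, 1] for grid_sample
        (2.0 * grid_x / (w - 1) - 1.0, 2.0 * grid_y / (h - 1) - 1.0), dim=-1
    )
    return F.grid_sample(prev_color, grid, align_corners=True)
```

In the pipeline the abstract describes, a correction step would then compare the warped colors against the diffusion model's prediction for the current frame and re-colorize regions where they disagree (e.g., at occlusions).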


Supplementary Material: L-CAD: Language-based Colorization with Any-level Descriptions using Diffusion Priors

Neural Information Processing Systems

To demonstrate the effectiveness of our proposed luminance-guided image compression, semantic-aligned latent representation, and instance-aware sampling strategy (details in Sec. ), we present additional qualitative results. We demonstrate our generalization capability by showing more colorization results on legacy black-and-white photos in Figure 1, where results are presented sequentially from left to right using descriptions at the complete, partial, and scarce levels.



Transforming Color: A Novel Image Colorization Method

Shafiq, Hamza, Lee, Bumshik

arXiv.org Artificial Intelligence

This paper introduces a novel method for image colorization that utilizes a color transformer and generative adversarial networks (GANs) to address the challenge of generating visually appealing colorized images. Conventional approaches often struggle to capture long-range dependencies and to produce realistic colorizations. The proposed method integrates a transformer architecture to capture global information and a GAN framework to improve visual quality. A color encoder that draws color features from a random normal distribution is applied; these features are then integrated with grayscale image features to enhance the overall representation of the images. Our method demonstrates superior performance compared with existing approaches by combining the transformer's capacity to capture long-range dependencies with the GAN's ability to generate realistic colorizations. Experimental results show that the proposed network significantly outperforms other state-of-the-art colorization techniques, highlighting its potential for image colorization. This research opens new possibilities for precise and visually compelling image colorization in domains such as digital restoration and historical image analysis.
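
The color-encoder idea lends itself to a short sketch: draw a color code from a standard normal distribution, project it, and fuse it with grayscale features before the transformer/GAN generator. Module names and shapes below are our assumptions for illustration, not the paper's exact architecture.

```python
# Minimal sketch of the described color encoder: draw a color code from a
# standard normal distribution, project it, and fuse it with grayscale image
# features. Shapes and module names are illustrative, not the paper's design.
import torch
import torch.nn as nn

class ColorEncoder(nn.Module):
    def __init__(self, z_dim=64, feat_dim=256):
        super().__init__()
        self.z_dim = z_dim
        # Map the random color code to the grayscale feature dimension.
        self.project = nn.Sequential(
            nn.Linear(z_dim, feat_dim),
            nn.ReLU(inplace=True),
            nn.Linear(feat_dim, feat_dim),
        )

    def forward(self, gray_feats):
        """gray_feats: (N, C, H, W) grayscale features (assumes C == feat_dim)."""
        n, c, _, _ = gray_feats.shape
        z = torch.randn(n, self.z_dim, device=gray_feats.device)
        color = self.project(z).view(n, c, 1, 1)  # (N, C, 1, 1) color features
        # Broadcast-add color to every spatial location; the fused map would
        # then feed the transformer/GAN generator.
        return gray_feats + color

# Example: fuse random color features with a dummy grayscale feature map.
fused = ColorEncoder()(torch.randn(2, 256, 32, 32))
```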


L-CAD: Language-based Colorization with Any-level Descriptions using Diffusion Priors

Chang, Zheng, Weng, Shuchen, Zhang, Peixuan, Li, Yu, Li, Si, Shi, Boxin

arXiv.org Artificial Intelligence

Language-based colorization produces plausible and visually pleasing colors under the guidance of user-friendly natural language descriptions. Previous methods implicitly assume that users provide comprehensive color descriptions for most of the objects in the image, which leads to suboptimal performance. In this paper, we propose a unified model to perform language-based colorization with any-level descriptions. We leverage the pretrained cross-modality generative model for its robust language understanding and rich color priors to handle the inherent ambiguity of any-level descriptions. We further design modules to align with input conditions to preserve local spatial structures and prevent the ghosting effect. With the proposed novel sampling strategy, our model achieves instance-aware colorization in diverse and complex scenarios. Extensive experimental results demonstrate our advantages of effectively handling any-level descriptions and outperforming both language-based and automatic colorization methods. The code and pretrained models are available at: https://github.com/changzheng123/L-CAD.
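
The abstract's emphasis on preserving local spatial structure can be illustrated with a common post-hoc trick in colorization (not L-CAD's actual alignment modules; see the linked repository for the real implementation): recompose the output in Lab space so the luminance comes from the grayscale input and only the chromatic channels come from the generated image.

```python
# A common post-hoc trick in colorization (not L-CAD's alignment modules):
# keep the grayscale input's luminance and take only the chromatic ab
# channels from the generated RGB image, so the input's local spatial
# structure is preserved exactly.
import numpy as np
from skimage.color import rgb2lab, lab2rgb

def replace_luminance(generated_rgb, gray):
    """generated_rgb: (H, W, 3) floats in [0, 1]; gray: (H, W) floats in [0, 1]."""
    lab = rgb2lab(generated_rgb)
    lab[..., 0] = gray * 100.0  # the L channel of Lab spans [0, 100]
    return np.clip(lab2rgb(lab), 0.0, 1.0)  # clip slight out-of-gamut values
```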


Incorporating Ensemble and Transfer Learning For An End-To-End Auto-Colorized Image Detection Model

Ragab, Ahmed Samir, Taie, Shereen Aly, Abdelnaby, Howida Youssry

arXiv.org Artificial Intelligence

Image colorization is the process of colorizing grayscale images or recoloring an already-colored image. This image manipulation can be applied to grayscale satellite, medical, and historical images, making them more expressive. With the increasing computational power of deep learning techniques, colorization results are becoming so realistic that human eyes cannot differentiate between natural and colorized images. However, this poses a potential security concern, as forged or manipulated images can be used for illegal purposes, and there is a growing need for effective detection methods to distinguish between natural color and computer-colorized images. This paper presents a novel approach that combines the advantages of transfer and ensemble learning to reduce training time and resource requirements, proposing a model that classifies natural color and computer-colorized images. The proposed model uses pre-trained VGG16 and ResNet50 branches, along with MobileNet v2 or EfficientNet feature vectors. It showed promising results, with accuracy ranging from 94.55% to 99.13% and very low Half Total Error Rate values, and it outperformed existing state-of-the-art models in classification performance and generalization capability.
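
A minimal sketch of the described ensemble-plus-transfer-learning setup is given below: pooled features from frozen pre-trained backbones are concatenated and fed to a small binary head (natural vs. computer-colorized). The backbone combination and head size here are our assumptions, not the paper's exact configuration.

```python
# Sketch of the ensemble-plus-transfer-learning idea: pooled features from
# frozen pre-trained backbones are concatenated and fed to a small binary
# head (natural vs. computer-colorized). Backbone choice and head size are
# our assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn
from torchvision import models

class ColorizationDetector(nn.Module):
    def __init__(self):
        super().__init__()
        vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
        res = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        mob = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
        self.vgg_features = vgg.features                               # (N, 512, h, w)
        self.res_features = nn.Sequential(*list(res.children())[:-2])  # (N, 2048, h, w)
        self.mob_features = mob.features                               # (N, 1280, h, w)
        for p in self.parameters():  # freeze the backbones only: the head
            p.requires_grad = False  # below is registered after this loop
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Sequential(
            nn.Linear(512 + 2048 + 1280, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(256, 1),  # single logit: colorized vs. natural
        )

    def forward(self, x):
        feats = [self.pool(f(x)).flatten(1)
                 for f in (self.vgg_features, self.res_features, self.mob_features)]
        return self.head(torch.cat(feats, dim=1))
```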


Attention-Aware Anime Line Drawing Colorization

Cao, Yu, Tian, Hao, Mok, P. Y.

arXiv.org Artificial Intelligence

Automatic colorization of anime line drawings has attracted much attention in recent years, since it can substantially benefit the animation industry. User-hint-based methods are the mainstream approach to line drawing colorization, while reference-based methods offer a more intuitive alternative. Nevertheless, although reference-based methods can improve feature aggregation between the reference image and the line drawing, their colorization results are not compelling in terms of color consistency or semantic correspondence. In this paper, we introduce an attention-based model for anime line drawing colorization, in which a channel-wise and spatial-wise Convolutional Attention module improves the encoder's feature extraction and key-area perception, and a Stop-Gradient Attention module with cross-attention and self-attention tackles the cross-domain long-range dependency problem. Extensive experiments show that our method outperforms other state-of-the-art methods, producing more accurate line structure and semantic color information.
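
The channel-wise and spatial-wise Convolutional Attention described here resembles CBAM-style attention; below is a minimal sketch under that assumption, not the paper's exact module (the Stop-Gradient Attention module is not reproduced).

```python
# CBAM-style sketch of channel-wise and spatial-wise convolutional attention,
# our assumption of what the paper's Convolutional Attention module resembles;
# the Stop-Gradient Attention module is not reproduced here.
import torch
import torch.nn as nn

class ConvAttention(nn.Module):
    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        # Channel attention: squeeze spatial dims, then excite per channel.
        self.channel_mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        # Spatial attention: a 2-channel (avg, max) map -> one attention map.
        self.spatial_conv = nn.Conv2d(2, 1, spatial_kernel,
                                      padding=spatial_kernel // 2)

    def forward(self, x):
        avg = self.channel_mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.channel_mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)  # channel-wise gating
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial_conv(s))  # spatial-wise gating

# Example: refine a line-drawing encoder feature map.
refined = ConvAttention(128)(torch.randn(1, 128, 64, 64))
```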